Similar resources
Comparison of Visual and Logical Character Segmentation in Tesseract OCR Language Data for Indic Writing Scripts
Language data for the Tesseract OCR system currently supports recognition of a number of languages written in Indic writing scripts. An initial study is described to create comparable data for Tesseract training and evaluation based on two approaches to character segmentation of Indic scripts: logical vs. visual. Results indicate further investigation of visual based character segmentation lang...
Tesseract Ocr: a Case Study for License Plate Recognition in Brazil
This paper presents the analysis of Google’s Tesseract OCR for license plate recognition in Brazil. The performance results presented for Tesseract OCR will be compared to market grade OCR products known here as “A” and “B”. This is a necessary measure due to a confidentiality agreement with the company supporting this research. The use of OpenCV is also considered due to limitations inherent t...
Development of a multi-user handwriting recognition system using Tesseract open source OCR engine
The objective of the paper is to recognize handwritten samples of lower case Roman script using Tesseract open source Optical Character Recognition (OCR) engine under Apache License 2.0. Handwritten data samples containing isolated and free-flow text were collected from different users. Tesseract is trained with user-specific data samples of both the categories of document pages to generate sep...
Shirorekha Chopping Integrated Tesseract OCR Engine for Enhanced Hindi Language Recognition
Tesseract OCR Engine is one of the most efficient open source OCR engines currently available. Tesseract OCR 3.01 is capable of recognizing the Hindi language, but it still needs some enhancement to improve its performance. The Hindi language recognition accuracy is quite low even for printed text, as the conjunct character combinations of the Hindi language are not easily separable due to...
Recognition of handwritten Roman Numerals using Tesseract open source OCR engine
The objective of the paper is to recognize handwritten samples of Roman numerals using Tesseract open source Optical Character Recognition (OCR) engine. Tesseract is trained with data samples of different persons to generate one user-independent language model, representing the handwritten Roman digit-set. The system is trained with 1226 digit samples collected from the different users. The per...
Journal
Journal title: The Programming Historian
Year: 2015
ISSN: 2397-2068
DOI: 10.46430/phen0042